.SEC(Systems Organization)

There are  many applications of  data structures  which arise
very  naturally in the domain  of systems programming.   This section
will begin  with  a  very  brief historical  description  of  systems
organization, from the bare machine to multi-processing.
	In  the  early  days of  computers,  operating  systems  were
non-existent.   You would  sign up for  a couple of  hours of machine
time, appear  with your  card deck  or paper  tape,  and operate  the
machine.  Debugging devices consisted of a sharp pointed stick, and a
quick  hand on the console  switches.  This  means of programming was
very satisfying to many of the programmers, but machine utilization
left something to be desired.  Much of the time the machine would
sit idle (stopped) while you thought about what to do next.  As
processing speed and  machine costs increased it became  evident that
this  mode  of operation  had to  go.   The  first  operating systems
consisted of a dull  slow object called  an operator and a  satellite
computer on which an input tape  consisting of many separate jobs was
created.  Each job on the input tape was sequentially read into
memory, executed, and its output written to an output tape.  If some
abnormal  behavior  was  detected  in  your program,  you  were  also
presented with an  often uninspiring  octal dump of  the contents  of
memory.  Whenever a job required, say, a data tape, the operator would
be notified,  and the machine would wait  until the tape was located,
mounted, and readied.  There was a moderately small  resident monitor
which controlled  the input, output  and perhaps the timing  of jobs.
Gross utilization of the machine was increased, but at the cost
of easy debugging.  This mode of operation is called batch
processing.

The CPU (central processing unit) still locks up when I/O
devices aren't ready, and the user can still physically halt the machine.
For this  model and  most  of the  future discussion,  it  simplifies
matters to  assume that the  monitor resides  in a separate  core box
from that which contains user programs.

Thus:

.GROUP SKIP 10;







The  user programs  are  loaded  into  explicit (hard)  locations  in
memory,  and it is  assumed that  the user has  access to all  of the
addressing space of his core box.

In an attempt to decrease the overhead of waiting for
external actions (I/O requests or operator intervention), we attach a
reasonably  fast  storage  device  (and  increase  the  size  of  the
monitor).  This storage device is  capable of holding many user jobs.
Whenever  a user (loser)  program requires a service  which cannot be
satisfied immediately,  the  monitor will  swap  the program  out  of
memory onto  this storage device, initiate action  which will tend to
satisfy the request, and bring  another job into core, beginning  its
execution.   This new job  may either  be the next  job on the  input
stream, or  a job which was swapped out and  now is ready to run. The
user is still given the whole addressing space and is still loaded
into explicit memory locations, but the monitor must now be more
clever.  It must be able to recognize (or `trap') requests which
would lock up the CPU.  It must be able to make decisions as to which
job to run next.





.GROUP SKIP 10;





Clearly there are gross inefficiencies in the present scheme.
Allowing, or in fact requiring, that each user have access to all of
memory is too generous.  Assume that our memory size is 32K words and
assume that we have two 16K jobs which are ready to run.  We should be
able  to load one job  into locations 0 - (16K-1)  and load the other
job into the  other half of memory.   When one job  requires external
service, we  should be able to  switch the CPU to  begin execution of
the other job provided that we save all information about the current
state of the suspended job.  The point is that it takes time to swap
jobs in and out of core, so anything we can do to decrease this
overhead is a winner.  What are the added requirements for this
scheme?  Mainly we must have some scheme for keeping the jobs out of
each  other's hair.  This  is usually done by  a hardware addition to
the machine  called the protection  register.
In this simple scheme the protection register  would be loaded by the
monitor  with the  upper  and lower  bounds on  the  addressing space
accessible to the current job.  Then every memory reference made by a
program is filtered  through the protection register.   Obviously the
instruction to load the protection register should be restricted to
execution by the monitor.
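
Here is a rough sketch of the check, written in present-day C (obviously
not the 1974 hardware); the names prot_lo, prot_hi and trap_to_monitor
are invented for illustration only.  The protection register holds the
two bounds, and every reference is filtered through a simple test:

    #include <stdio.h>
    #include <stdlib.h>

    typedef unsigned long addr_t;

    /* Bounds loaded by the (privileged) monitor; names are invented.
       Here the job occupies the upper half of a 32K-word memory.     */
    static addr_t prot_lo = 16384;
    static addr_t prot_hi = 32767;

    /* An illegal reference traps to the monitor, which aborts the job. */
    static void trap_to_monitor(addr_t a)
    {
        fprintf(stderr, "protection violation at address %lu\n", a);
        exit(1);
    }

    /* Every memory reference made by a user program is filtered here. */
    static addr_t check_reference(addr_t a)
    {
        if (a < prot_lo || a > prot_hi)
            trap_to_monitor(a);
        return a;
    }

    int main(void)
    {
        printf("%lu\n", check_reference(16500));   /* allowed           */
        printf("%lu\n", check_reference(100));     /* traps to monitor  */
        return 0;
    }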

What are  the inefficiencies in  this scheme?   Consider what
happens  when  we have  to  swap a  job  out of  memory.    Since the
addresses  used  by  the  program  are  explicitly  tied   to  memory
locations, we must swap that job back into exactly the same locations
if  it is  to run correctly.   Consider  two 16K  programs whose only
effect is to execute `jump-self' loops ( L (JRST  L) ). Their initial
load might look like:


.GROUP SKIP 10;




If we swap out the top job and try to swap it back into the lower 16K
at some later time, we lose big.


.GROUP SKIP 10;





But clearly if the bottom  16K is available we should be  able to use
it.  We want to be able to swap our programs into any available block
of core which is of sufficient size.

To  accomplish  this we  add a  new piece  of hardware,  the
relocation register. Now our loader loads all our programs as if they
begin at location 0.  Thus in the above example:


.GROUP SKIP 10;





To get the appropriate effect when we execute any program, we bias
every memory reference by the actual address assigned to the program's
location 0.  This bias is put in the relocation register by the
monitor before it begins execution of the program.  With this scheme
we can run any job in any block of core which is of the correct size.
All we  need do  is load  the  appropriate offset  in the  relocation
register.




.GROUP SKIP 10;



Now we can dispense with the two-core box approach.  Just use
one box and bias  every access in a user program by the length of the
monitor plus the usual offset:




.GROUP SKIP 10;




The actual hardware can also be simplified now since the lower bound
on every user's addressing space is always zero.
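
Putting the last two refinements together, each user-mode reference is
checked against a single upper limit and then biased by the relocation
register.  A small sketch in present-day C of what the hardware does on
every reference; the names reloc_base, reloc_limit and map_address are
invented for illustration:

    #include <stdio.h>
    #include <stdlib.h>

    typedef unsigned long addr_t;

    /* Hypothetical registers, loaded by the monitor before the job runs. */
    static addr_t reloc_base;    /* where the job's location 0 really is  */
    static addr_t reloc_limit;   /* size of the job's addressing space    */

    /* Every user address is relative to 0; the hardware checks it against
       the (single) upper bound and then adds the relocation base.         */
    static addr_t map_address(addr_t user_addr)
    {
        if (user_addr >= reloc_limit) {        /* lower bound is always 0  */
            fprintf(stderr, "protection violation at %lu\n", user_addr);
            exit(1);                           /* trap to the monitor      */
        }
        return reloc_base + user_addr;         /* actual core address      */
    }

    int main(void)
    {
        reloc_base  = 16384;   /* say, monitor length plus the block offset */
        reloc_limit = 16384;   /* a 16K job                                 */
        printf("user 0   -> core %lu\n", map_address(0));
        printf("user 100 -> core %lu\n", map_address(100));
        return 0;
    }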

Clearly, more refinements need to be made.  At present the
only way a job gets interrupted is if it terminates, calls for some
external action, or attempts to perform some illegal or restricted
instruction.  Any operating system needs to be able to monitor the
amount of CPU time spent on a job.  So we should have a programmable
clock which can be set by the monitor and which will send an
interrupt to the CPU at a specified time.  Actually other hardware needs
to  be  added  to  our  entourage  if  we  intend  to  do  reasonable
time-sharing.  A  sophisticated   priority  interrupt  system   is  a
necessity.  Since  many of the applications of data structures appear
in time-sharing systems, we will say a bit about the behavior of such
systems.

.SS(Time-sharing organization)

What makes time-sharing viable is the tremendous difference
between human reaction time and machine speeds.  In a period of a few
seconds a well  designed t-s  system can satisfy  simple requests  by
many users.   If  a user must  wait a significant  length of  time to
obtain a response to a simple command, then the system is in
trouble.   The heart  of a  rapid  response is  a priority  interrupt
system.

A time-sharing monitor must service terminals and mass storage
devices, and must control the dynamic allocation of its memories.
Consider the  care and feeding  of a relatively  fast storage
device,  say  a  disk,  and  its  interaction  with the  sending  and
receiving of characters from user terminals.  Most disks require that a
sequence of commands be issued before an actual data transfer can
take  place.  Assume that there are two  commands: a seek to find the
appropriate area on the device, followed by the transfer command.  If
the CPU were tied up from the beginning of the seek to the end of the
transfer, significant  amounts of  CPU time  would be  lost.   If  we
issued the  seek, went  off to  do other  calculations, but were  not
responsive  enough when  the  seek was  completed we  might  miss the
chance to make the transfer due to intervening rotation of  the disk.
What we can do  is put the disk on a  high priority interrupt channel
and the terminal keyboards on a medium priority channel.  We issue the
seek and go back to computation, perhaps servicing the keyboards; when
the disk is in position we get a high priority interrupt, freezing the
state of the computation; at this point we begin the transfer of
information, dismiss the interrupt, and continue the computation.
Thus higher priority requests can interrupt lower ones;
any  lower priority  requests must  be suspended  or saved  until the
higher one is completed.  You might decide to design the  hardware to
allow recursive interrupts  on a particular channel  or might require
completion  before another  request can  be serviced.  
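
The dispatching order can be pictured, very schematically, in
present-day C.  The channel numbers, handler names, and the explicit
`pending' flags below are all invented for illustration; real
interrupts arrive asynchronously rather than being set by hand:

    #include <stdio.h>

    /* A toy model of a two-level priority interrupt system: the disk
       channel has higher priority than the terminal keyboard channel. */

    #define NCHANNELS 2
    enum { DISK = 0, KEYBOARD = 1 };       /* lower number = higher priority */

    static int pending[NCHANNELS];         /* interrupt requests waiting */

    static void disk_handler(void)     { printf("disk in position: start transfer\n"); }
    static void keyboard_handler(void) { printf("read a character from a terminal\n"); }

    static void (*handler[NCHANNELS])(void) = { disk_handler, keyboard_handler };

    /* Service the highest-priority pending request; anything of lower
       priority stays suspended until the higher one has been dismissed. */
    static void dispatch(void)
    {
        for (int ch = 0; ch < NCHANNELS; ch++) {
            if (pending[ch]) {
                handler[ch]();
                pending[ch] = 0;           /* dismiss the interrupt */
                return;
            }
        }
        printf("no interrupts pending: keep computing\n");
    }

    int main(void)
    {
        pending[KEYBOARD] = 1;             /* a character arrives...        */
        pending[DISK] = 1;                 /* ...and the seek completes too */
        dispatch();                        /* disk is serviced first        */
        dispatch();                        /* then the keyboard             */
        dispatch();                        /* then back to computation      */
        return 0;
    }

The only point being illustrated is the ordering: the disk, sitting on
the higher priority channel, is always serviced before any waiting
keyboard request.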

What about the data structures used in a complex system?
Data structures are used heavily in the scheduling algorithms.  The
t-s monitor must make decisions about which jobs in the system are to
run.   These decisions are  based on such  simple things as:  the job
isn't  requesting service--dormant; the  job is waiting  for some I/O
device--I/O  wait; or  decisions  must  be based  on  more  difficult
criteria: the  job is ready to  run but is too big  for the currently
available allotment of core;  there is a job  ready which needs  fast
response or has been waiting inordinately long for service.

In  general a  vast  array  of  information can  be  used  in
deciding which  job to run next.   Usually these decisions take shape
by requiring that the jobs currently in the system be partitioned
into a finite set of waiting-lines or queues.  The scheduling
algorithm manipulates these queues as the status of the jobs changes.
A queue  is  a  data structure.    Queues  are also  called  first-in
first-out lists (FIFO) or pop-up  lists.  In LISP parlance you append
new queue-elements to the end of  the list and take the elements  off
of the  front of  the  list.   It should  be obvious  where the  name
waiting-line comes from.

Queues are also used in the buffering schemes of I/O devices.
As we type `LOGIN.....'  to a t-s system, it is usually the case that
the character string first goes into a system buffer and when the CPU
is free,  the command is decoded.   Obviously we want  the monitor to
see  the character  string in the  same order  in which  we typed it.
That is, the buffer is acting like a queue.

A queue can be implemented simply as a list with a pointer to
the front and a pointer to the end.  When you add something to the
end, you update one pointer; when you take an element off the front
you update  the other pointer.   There is  the obvious check  to make
sure  that you can recognize the empty  queue: when the front pointer
meets the end pointer:


.GROUP SKIP 6;


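A sketch of such a queue in present-day C (the notes, of course,
predate C; all the names are illustrative): a linked list with a front
pointer and an end pointer, where the empty queue is the case in which
the two pointers have `met' at nothing:

    #include <stdio.h>
    #include <stdlib.h>

    typedef struct cell {
        int item;
        struct cell *next;
    } cell;

    typedef struct {
        cell *front;               /* next element to be removed */
        cell *end;                 /* last element added         */
    } queue;

    /* append a new element at the end of the list */
    static void enqueue(queue *q, int item)
    {
        cell *c = malloc(sizeof *c);
        c->item = item;
        c->next = NULL;
        if (q->end == NULL)        /* queue was empty */
            q->front = c;
        else
            q->end->next = c;
        q->end = c;
    }

    /* take an element off the front; returns 0 and fills *item on success */
    static int dequeue(queue *q, int *item)
    {
        if (q->front == NULL)      /* the empty queue */
            return -1;
        cell *c = q->front;
        *item = c->item;
        q->front = c->next;
        if (q->front == NULL)      /* front has met the end: empty again */
            q->end = NULL;
        free(c);
        return 0;
    }

    int main(void)
    {
        queue q = { NULL, NULL };
        enqueue(&q, 1); enqueue(&q, 2);    /* characters of a command, say */
        int x;
        while (dequeue(&q, &x) == 0)
            printf("%d\n", x);             /* comes out 1, then 2 */
        return 0;
    }
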
As with stacks, the queue is usually not represented as a list of
arbitrary length.  A queue can be represented as a sequential set of
memory locations, still with two pointers.  What happens when we have
filled the last cell in the sequence?  Well, if some process has
been removing items from the queue, then the front pointer has moved:


.GROUP SKIP 6;

In this case  the cells from  the first cell  in the sequence  to the
current position of the front pointer are available.  Use `em.  That
is, a queue is usually implemented as a sequential circular list with
two pointers.  In this implementation the tests for the empty queue
are slightly more complex, but are still obvious.  Notice, too, that
we must now also test for the full queue.
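
A sketch of the circular version in C.  The trick of leaving one slot
unused so that the full and empty cases can be told apart is one common
choice, not something the notes prescribe:

    #include <stdio.h>

    /* A queue as a sequential, circular set of memory locations with
       two pointers (here, indices into the array). */

    #define QSIZE 8

    static int buf[QSIZE];
    static int front = 0;          /* next element to remove */
    static int end   = 0;          /* next free slot         */

    static int empty(void) { return front == end; }
    static int full(void)  { return (end + 1) % QSIZE == front; }

    static int enqueue(int item)
    {
        if (full())
            return -1;             /* caller must wait (or be suspended) */
        buf[end] = item;
        end = (end + 1) % QSIZE;
        return 0;
    }

    static int dequeue(int *item)
    {
        if (empty())
            return -1;             /* nothing to remove */
        *item = buf[front];
        front = (front + 1) % QSIZE;
        return 0;
    }

With this representation the `pointers' are just indices, and the
wrap-around replaces the special case at the end of the sequence.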

In  system applications  it  is  usually  the case  that  one
process is filling the queue and one process is emptying it.  In this
application, a full  queue should simply put  the filling process  to
`sleep' (or suspend it), and when a queue becomes empty, the emptying
process  should  be  suspended.    There  are  some  interesting  and
non-trivial problems which arise in connection with such
`cooperating processes'.
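
As a taste of the problem, here is a sketch of two such cooperating
processes written with modern POSIX threads (a convenience the notes'
hardware certainly did not have): the filling process goes to sleep on
a full queue, the emptying process sleeps on an empty one, and each
wakes the other when the condition changes.  Compile with cc -pthread.

    #include <pthread.h>
    #include <stdio.h>

    #define QSIZE 8
    static int buf[QSIZE];
    static int front = 0, end = 0, count = 0;

    static pthread_mutex_t lock      = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  not_full  = PTHREAD_COND_INITIALIZER;
    static pthread_cond_t  not_empty = PTHREAD_COND_INITIALIZER;

    static void *filler(void *arg)
    {
        (void)arg;
        for (int i = 0; i < 20; i++) {
            pthread_mutex_lock(&lock);
            while (count == QSIZE)                 /* queue full: go to sleep */
                pthread_cond_wait(&not_full, &lock);
            buf[end] = i;
            end = (end + 1) % QSIZE;
            count++;
            pthread_cond_signal(&not_empty);       /* wake the emptier */
            pthread_mutex_unlock(&lock);
        }
        return NULL;
    }

    static void *emptier(void *arg)
    {
        (void)arg;
        for (int i = 0; i < 20; i++) {
            pthread_mutex_lock(&lock);
            while (count == 0)                     /* queue empty: go to sleep */
                pthread_cond_wait(&not_empty, &lock);
            printf("%d\n", buf[front]);
            front = (front + 1) % QSIZE;
            count--;
            pthread_cond_signal(&not_full);        /* wake the filler */
            pthread_mutex_unlock(&lock);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t f, e;
        pthread_create(&f, NULL, filler, NULL);
        pthread_create(&e, NULL, emptier, NULL);
        pthread_join(f, NULL);
        pthread_join(e, NULL);
        return 0;
    }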

There are some interesting data  structure applications which
are connected with the allocation and liberation of storage.  The
monitor  must  be  able  to   allocate  storage  to  jobs  as   their
requirements become known  and must also be able to  recover areas of
memory which  are no longer being used.  Similarly, since an integral
part of a large system is a file structure, the  monitor must be able
to handle  the demands of file  maintenance.  These  two problems are
related  but  solutions  for  core  allocation  are  not  necessarily
reasonable ones for file allocation.